Infusion of Labeled Data into Distant Supervision for Relation Extraction

Authors

  • Maria Pershina
  • Bonan Min
  • Wei Xu
  • Ralph Grishman
Abstract

Distant supervision usually utilizes only unlabeled data and existing knowledge bases to learn relation extraction models. However, in some cases a small amount of human-labeled data is available. In this paper, we demonstrate how a state-of-the-art multi-instance multi-label model can be modified to make use of these reliable sentence-level labels in addition to the relation-level distant supervision from a database. Experiments show that our approach achieves a statistically significant increase of 13.5% in F-score and 37% in area under the precision-recall curve.

Similar resources

Relation Extraction Using TBL with Distant Supervision

Supervised machine learning methods have been widely used in relation extraction, which finds the relation between two named entities in a sentence. However, their disadvantages are that constructing training data is costly and time-consuming, and that the machine learning system depends on the domain of the training data. To overcome these disadvantages, we construct a weakly labeled data se...

Reducing Wrong Labels in Distant Supervision for Relation Extraction

In relation extraction, distant supervision seeks to extract relations between entities from text by using a knowledge base, such as Freebase, as a source of supervision. When a sentence and a knowledge base refer to the same entity pair, this approach heuristically labels the sentence with the corresponding relation in the knowledge base. However, this heuristic can fail with the result that s...

Noise-Clustered Distant Supervision for Relation Extraction: A Nonparametric Bayesian Perspective

For the task of relation extraction, distant supervision is an efficient approach to generating labeled data by aligning a knowledge base with free text. In essence, it is a challenging incomplete multi-label classification problem with sparse and noisy features. To address the challenge, this work presents a novel nonparametric Bayesian formulation for the task. Experimental results show substan...

Distant supervision for relation extraction without labeled data

Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relation...

Relation Extraction for the Food Domain without Labeled Training Data - Is Distant Supervision the Best Solution?

We examine the task of relation extraction in the food domain by employing distant supervision. We focus on the extraction of two relations that are not only relevant to product recommendation in the food domain, but that also have significance in other domains, such as the fashion or electronics domain. In order to select suitable training data, we investigate various degrees of freedom. We co...

Publication date: 2014